Yule
{(\rho-3)\;\rho}\, for \rho>3\,
| kurtosis = \rho+3+\frac{11\rho^3-49\rho-22}{(\rho-4)\;(\rho-3)\;\rho}\, for \rho>4\,
| entropy =
| mgf = \frac{\rho}{\rho+1}\;{}_2F_1(1,1; \rho+2; e^t)\,e^t \,
| char = \frac{\rho}{\rho+1}\;{}_2F_1(1,1; \rho+2; e^{i\,t})\,e^{i\,t} \,
}}

In probability and statistics, the Yule–Simon distribution is a discrete probability distribution named after Udny Yule and Herbert Simon. Simon originally called it the Yule distribution.

The probability mass function of the Yule–Simon(ρ) distribution is
: f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1), \,
for integer k \geq 1 and real \rho > 0, where \mathrm{B} is the beta function. Equivalently, the pmf can be written in terms of the falling factorial as
: f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{(k+\rho)^{\underline{\rho+1}}}, \,
where \Gamma is the gamma function. Thus, if \rho is an integer,
: f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}. \,

The parameter \rho can be estimated using a fixed point algorithm.

The probability mass function f has the property that for sufficiently large k we have
: f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}. \,
This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: f(k;\rho) can be used to model, for example, the relative frequency of the k-th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of k.

Occurrence

The Yule–Simon distribution arose originally as the limiting distribution of a particular stochastic process studied by Yule as a model for the distribution of biological taxa and subtaxa. Simon dubbed this process the "Yule process", but it is more commonly known today as a preferential attachment process.
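As a numerical aside on the probability mass function f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1) given earlier, the following is a minimal Python sketch, not a reference implementation. It evaluates the pmf through the log-gamma function and compares it with the large-k approximation \rho\,\Gamma(\rho+1)/k^{\rho+1}. The function names yule_simon_pmf and tail_approx are illustrative assumptions and are not taken from any library.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Evaluate f(k; rho) = rho * B(k, rho + 1) for integer k >= 1 and rho > 0.

    The beta function is computed as exp(lgamma(k) + lgamma(rho + 1)
    - lgamma(k + rho + 1)) so that large k does not overflow.
    """
    if k < 1 or rho <= 0:
        raise ValueError("requires integer k >= 1 and real rho > 0")
    log_beta = math.lgamma(k) + math.lgamma(rho + 1.0) - math.lgamma(k + rho + 1.0)
    return rho * math.exp(log_beta)


def tail_approx(k: int, rho: float) -> float:
    """Large-k approximation rho * Gamma(rho + 1) / k**(rho + 1)."""
    return rho * math.gamma(rho + 1.0) / k ** (rho + 1.0)


if __name__ == "__main__":
    rho = 2.0  # illustrative parameter value, chosen only for this example

    # Partial sum of the pmf; it approaches 1 as the upper limit grows.
    total = sum(yule_simon_pmf(k, rho) for k in range(1, 10_001))
    print("partial sum over k = 1..10000:", total)

    # Ratio of the exact pmf to the power-law approximation; it tends to 1.
    for k in (10, 100, 1000):
        print(k, yule_simon_pmf(k, rho) / tail_approx(k, rho))
```

For integer \rho the result can be checked by hand against the factorial form: with \rho = 2 and k = 1 both expressions give 2/3. The ratio printed in the loop approaches 1 as k grows, which is the power-law tail behaviour stated above.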
The preferential attachment process is an urn process in which balls are added to a growing number of urns, each ball being allocated to an urn with probability linear in the number of balls the urn already contains.

The distribution also arises as a continuous mixture of geometric distributions. Specifically, assume that W follows an exponential distribution with scale 1/\rho or rate \rho:
: W \sim \mathrm{Exponential}(\rho)\,
: h(w;\rho) = \rho \, \exp(-\rho\,w)\,
Then, conditional on W, a Yule–Simon distributed variable K has the following geometric distribution:
: K \sim \mathrm{Geometric}(\exp(-W))\,
The pmf of a geometric distribution is
: g(k; p) = p \, (1-p)^{k-1}\,
for k\in\{1,2,\dots\}. The Yule–Simon pmf is then the following exponential-geometric mixture distribution:
: f(k;\rho) = \int_0^{\infty} g(k;\exp(-w))\,h(w;\rho)\,dw \,
Substituting u = \exp(-w) turns the integral into \rho \int_0^1 u^{\rho}\,(1-u)^{k-1}\,du = \rho\,\mathrm{B}(k, \rho+1), which is the pmf given above.

Generalizations

The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule–Simon(ρ, α) distribution is defined as
: f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}} \; \mathrm{B}_{1-\alpha}(k, \rho+1), \,
with 0 \leq \alpha < 1. For \alpha = 0 the ordinary Yule–Simon(ρ) distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.

See also
* Beta function
* Preferential attachment

Bibliography
* Colin Rose and Murray D. Smith, Mathematical Statistics with Mathematica. New York: Springer, 2002, ISBN 0-387-95234-9. (See page 107, where it is called the "Yule distribution".)

References

Category:Discrete distributions